The document provides an overview of video coding techniques used in video compression standards. It discusses how video compression exploits both the spatial and temporal redundancy in video signals. Key techniques covered include motion-compensated prediction, where a current frame is predicted from previously coded reference frames using motion vectors, and block-based motion estimation to determine the motion vectors. The document also outlines the generic architecture of video compression systems, which apply representation, quantization, and binary encoding steps to remove redundancy from video signals.
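The block-based motion estimation described above can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD) over candidate displacements. This is a minimal illustration of the idea, not the algorithm of any particular standard; all names and the synthetic frames are illustrative:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences, the usual block-matching cost."""
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def estimate_motion(ref, cur, y, x, block=8, search=4):
    """Full-search block matching: find the motion vector (dy, dx) within
    +/-search pixels that best predicts the current block from ref."""
    target = cur[y:y + block, x:x + block]
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry and 0 <= rx and ry + block <= ref.shape[0] and rx + block <= ref.shape[1]:
                cost = sad(ref[ry:ry + block, rx:rx + block], target)
                if cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost

# Synthetic frames: the "current" frame is the reference shifted 2 px
# down-right, so the true vector for an interior block is (-2, -2)
# with zero prediction residual.
ref = (np.arange(32 * 32).reshape(32, 32) % 256).astype(np.uint8)
cur = np.roll(np.roll(ref, 2, axis=0), 2, axis=1)
mv, cost = estimate_motion(ref, cur, 8, 8)
print(mv, cost)  # (-2, -2) 0
```

Real encoders replace the exhaustive loop with fast search patterns, but the cost function and the shape of the result are the same.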
The document provides an overview of the status of MPEG-4 developments and the AIC Initiative. It discusses the goals, history, and architecture of MPEG-4, which aims to code audio-visual objects and scenes to enable interactivity. MPEG-4 extends existing architectures like MPEG-2 and IP to new environments through tools like an interactive scene description and support for new content types and delivery formats. Profiles and levels are defined to suit different applications. Carriage of MPEG-4 over MPEG-2 and IP is also addressed.
This document provides an overview of key concepts in multimedia systems including digital video formats, properties of video such as frame rate and aspect ratio, video compression techniques, and video production equipment and processes. It covers analog vs digital video, interlacing vs progressive scanning, common video file formats like AVI, MOV, and MPG, and how to transfer video from a camcorder to a computer.
This document discusses different digital video technologies including desktop video formats, software and hardware codecs, DVD output, and video editing software systems. It covers popular formats like QuickTime, Video for Windows, MPEG, and RealPlayer. It also discusses hardware like DVD players, encoder/decoder cards, and semi-professional digital video editing solutions that allow capturing, editing and outputting video to tape or file.
This document discusses the transition from tape-based to file-based workflows in audiovisual production. It covers the history from linear tape-based workflows to emerging digital file-based workflows. Key benefits of file-based workflows include the ability to store rich metadata with media files throughout the production process, faster editing and sharing of content, and more flexible archiving and retrieval of content and associated metadata. Standards organizations are helping facilitate file-based workflows by establishing standards for file formats and metadata. Specific Sony products like XDCAM are highlighted as examples of technologies supporting file-based acquisition, editing and distribution.
This document provides an analysis of an interactive menu for the film Greenstreet. The menu features faded film clips in the background with a red title over green grass. Animation, visual effects, color rendering, and movement are used but there is no rotation. Techniques like blur, sharpening, and opacity are applied to the clips. The menu is formatted for H.264/MPEG-4 AVC video, which offers high quality at low file sizes but requires more encoding time and hardware. The analysis concludes with the course unit on motion graphics and video compositing.
Digital video can be recorded and edited on a computer. It is stored using file formats like AVI, MOV, MPEG, and FLV which determine compatibility and file size. Digital video is composed of individual frames that have a rate, size, and color depth. Video editing software allows cutting, combining, and adding effects to video clips. Captured digital video can be used in multimedia products like presentations, websites, and games.
This document provides an overview of video formats, which involve containers and codecs. The container describes the file structure and can contain different codecs. The codec is how the video is encoded and determines quality. Popular containers include MP4, MOV, and AVI, while common codecs are H.264, MPEG-4, and DivX. Choosing a format depends on factors like file size, quality, frame rate, bitrate, and resolution, as well as how the video will be transmitted or shared.
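The interplay of bitrate, duration, and file size mentioned above reduces to simple arithmetic; a back-of-the-envelope sketch (function names are illustrative, and kbps here means 1000 bits per second):

```python
def stream_size_mb(bitrate_kbps, seconds):
    """Approximate stream size in megabytes for a given average bitrate."""
    return bitrate_kbps * 1000 * seconds / 8 / 1e6

def raw_video_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed data rate in megabits per second."""
    return width * height * fps * bits_per_pixel / 1e6

# A 10-minute clip at a typical 5000 kbps encode:
print(round(stream_size_mb(5000, 600)))        # 375 (MB)
# versus the raw 1080p30 rate the codec started from:
print(round(raw_video_mbps(1920, 1080, 30)))   # 1493 (Mbit/s)
```

The gap between the two numbers is exactly what the codec choice controls, which is why format decisions hinge on bitrate and quality together.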

Digital video has replaced analog video as the preferred method for making and delivering video content in multimedia. Video files can be extremely large, so compression techniques like MPEG and JPEG are used to reduce file sizes. There are two types of compression: lossless, which preserves quality, and lossy, which eliminates some data to provide greater compression ratios at the cost of quality. Digital video editing software allows for adding effects, transitions, titles and synchronizing video and audio.
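The lossless/lossy distinction can be made concrete with two toy operations: run-length encoding, which round-trips exactly, and quantization, which discards detail to make the data more compressible. This is a minimal sketch, not the actual MPEG or JPEG algorithms:

```python
def rle_encode(data):
    """Lossless run-length encoding: a list of [value, run-length] pairs."""
    runs = []
    for v in data:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

def quantize(data, step):
    """Lossy quantization: values snap to the nearest multiple of step."""
    return [round(v / step) * step for v in data]

samples = [10, 10, 10, 12, 13, 13, 200, 200, 201]
encoded = rle_encode(samples)
assert rle_decode(encoded) == samples      # lossless: exact round trip
lossy = quantize(samples, 10)
print(lossy)  # [10, 10, 10, 10, 10, 10, 200, 200, 200]
# The quantized data has longer runs, so it compresses to fewer pairs:
assert len(rle_encode(lossy)) < len(encoded)
```

The assertion at the end is the whole story of lossy compression: give up detail below some threshold, gain a better compression ratio.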
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water... (IDES Editor)
This document describes a hardware implementation of a genetic algorithm based digital color image watermarking system. The system embeds a watermark image into the luminance (Y) channel of a host color image after converting the image from RGB to YUV color space. A genetic algorithm is used to determine optimal intensity values in the host image for embedding the watermark image bits invisibly. The proposed design is implemented as a custom circuit for real-time watermarking of images as they are captured by a digital camera. Synthesis results for an Altera Cyclone II FPGA show that the design can operate with a 5 ns clock period (200 MHz) and consumes a maximum of 73.84 mW.
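The paper's specific genetic algorithm is not reproduced here, but the generic GA loop it builds on (selection, crossover, mutation against a fitness function) can be sketched with a toy fitness function standing in for the watermarking imperceptibility measure. Every name and parameter below is illustrative:

```python
import random

random.seed(0)

def fitness(genome):
    """Toy stand-in for the paper's watermark quality measure:
    here, simply the number of 1 bits (the classic OneMax problem)."""
    return sum(genome)

def evolve(pop_size=20, genome_len=16, generations=40, mut_rate=0.05):
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genome_len)      # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability mut_rate per bit:
            child = [bit ^ (random.random() < mut_rate) for bit in child]
            children.append(child)
        pop = parents + children                       # elitist replacement
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

In the watermarking setting, the genome would encode candidate embedding intensities and the fitness would balance invisibility against robustness; the loop itself is unchanged.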
This document discusses digital video codecs and compression. It begins by defining pixel resolutions for standard definition, high definition, and digital cinema. It then covers CMOS image sensors used for HD, 2K and 4K capture and explains intra-frame and inter-frame compression. The document provides an example of the Apple ProRes 422 codec and analyzes its key attributes. It also discusses interlaced vs progressive scanning, picture impairments from compression, digital cinema standards, and predicts that advances in compression will continue to be needed to handle higher resolutions and frame rates.
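The inter-frame idea (coding only what changed since the previous frame) can be shown in a few lines; a simplified sketch without motion compensation or transforms:

```python
import numpy as np

def inter_frame_residual(prev, cur):
    """Inter-frame coding sketch: transmit only the difference from the
    previous decoded frame; static regions become zero residual."""
    return cur.astype(int) - prev.astype(int)

prev = np.zeros((4, 4), dtype=np.uint8)
cur = prev.copy()
cur[1, 2] = 50                       # one pixel changed between frames

residual = inter_frame_residual(prev, cur)
print(np.count_nonzero(residual))    # 1 -- only the changed pixel costs bits
reconstructed = prev.astype(int) + residual
assert (reconstructed == cur).all()  # decoder recovers the frame exactly
```

Intra-frame coding, by contrast, compresses each frame on its own, which is why intra-only codecs like ProRes are easier to edit but produce larger files.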
The document discusses various analog and digital video interfaces. It describes common analog video interfaces like composite video, S-video, component video and RGB analog video. It then covers digital video interfaces such as HDMI, DVI, FireWire, S/PDIF and SDI. For each interface, it provides details on technical standards, maximum supported resolutions and example uses.
High-speed Distributed Video Transcoding for Multiple Rates ... (Videoguy)
This paper describes a distributed video transcoding system that can simultaneously transcode an MPEG-2 video file into various video coding formats with different rates. The transcoder divides the MPEG-2 file into small segments along the time axis and transcodes them in parallel on multiple PCs. Efficient video segment handling methods are proposed that minimize the inter-processor communication overhead and eliminate temporal discontinuities from the re-encoded video.
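The segment-splitting strategy can be sketched generically; this toy version uses a thread pool and a stub in place of a real encoder, and all names are illustrative rather than taken from the paper:

```python
from concurrent.futures import ThreadPoolExecutor

def split_segments(duration_s, segment_s):
    """Divide a timeline into (start, end) segments along the time axis."""
    return [(t, min(t + segment_s, duration_s))
            for t in range(0, duration_s, segment_s)]

def transcode_segment(seg):
    """Stub standing in for re-encoding one segment on one worker PC."""
    start, end = seg
    return f"segment {start}-{end}s done"

segments = split_segments(duration_s=60, segment_s=25)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(transcode_segment, segments))
print(results)
# ['segment 0-25s done', 'segment 25-50s done', 'segment 50-60s done']
```

The hard parts the paper addresses (segment boundaries that respect GOP structure and avoiding temporal discontinuities at the joins) live inside the real version of `transcode_segment`.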
Generic Video Adaptation Framework Towards Content- and Context-Awareness in... (Alpen-Adria-Universität)
This document proposes a generic video adaptation framework that is content- and context-aware. It introduces a unified adaptation format based on H.264 features to enable codec reusability and various adaptation algorithms. The framework includes format adapters between decoders, an adaptation pool, and encoders to convert between their formats. It aims to allow any type of adaptation like bitrate or resolution changes across any codec generically while maintaining quality and real-time constraints.
CSEP Acquisition Preparation Technical Training Course Sampler (Jim Jenkins)
The DoD acquisition process involves taking a system from concept to operation, maintenance and disposal. Key SE activities include:
- Defining stakeholder and technical requirements through concept of operations, requirements analysis and definition
- Developing the system architecture through functional analysis, allocation and architecture synthesis
- Implementing, integrating, verifying and validating the system
- Transitioning to operations, and maintaining and eventually disposing of the system
This covers the key SE roles outlined in section 4.1 of the Defense Acquisition Guidebook, including translating user needs to requirements and designing the system architecture. The course then delves into how SE is applied throughout the different phases of the DoD life cycle.
Euler's theorem states that for any connected plane graph, the number of vertices (v) minus the number of edges (e) plus the number of faces (f) equals 2. The document proves this by taking a spanning tree (T) of the graph and the corresponding spanning tree (D) of its dual: T has v - 1 edges, D has f - 1 edges, and together they account for all e edges of the original graph, so (v - 1) + (f - 1) = e, which rearranges to v - e + f = 2. Applications of the theorem include that any plane graph contains a vertex of degree at most 5, and that any finite set of points not all on a line determines a line passing through exactly two of them.
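The formula is easy to sanity-check numerically; a tiny sketch using two well-known planar drawings (the face counts include the unbounded outer face):

```python
def euler_faces(vertices, edges):
    """For a connected plane graph, Euler's formula v - e + f = 2
    gives the face count (including the outer face) as f = 2 - v + e."""
    return 2 - vertices + edges

# Cube graph drawn in the plane: 8 vertices, 12 edges -> 6 faces
# (5 bounded regions plus the unbounded outer face).
assert euler_faces(8, 12) == 6
# Tetrahedron: 4 vertices, 6 edges -> 4 faces.
assert euler_faces(4, 6) == 4
print(euler_faces(8, 12))  # 6
```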
Fundamentals Of Space Systems & Space Subsystems course sampler (Jim Jenkins)
This course in space systems and space subsystems is for technical and management personnel who wish to gain an understanding of the important technical concepts in the development of space instrumentation, subsystems, and systems. The goal is to assist students in achieving their professional potential by endowing them with an understanding of the subsystems and supporting disciplines important to developing space instrumentation, space subsystems, and space systems. It is designed for participants who expect to plan, design, build, integrate, test, launch, operate or manage subsystems, space systems, launch vehicles, spacecraft, payloads, or ground systems. The objective is to expose each participant to the fundamentals of each subsystem and their inter-relations: not necessarily to make each student a systems engineer, but to give aerospace engineers and managers a technically based space systems perspective. The fundamental concepts are introduced and illustrated by state-of-the-art examples. This course differs from the typical space systems course in that the technical aspects of each important subsystem are addressed.
This document provides an introduction to digital television. It discusses analog TV standards and the conversion to digital with ITU-R BT.601 and BT.709 standards defining digital video formats. It also describes MPEG transport streams, the DVB system for content delivery over satellite, cable and terrestrial networks, and conditional access systems. Packetized elementary streams (PES) and program specific information (PSI) tables are also introduced.
The RSA cryptosystem document discusses:
1) The RSA cryptosystem uses a public and private key to encrypt and decrypt messages, with security based on the difficulty of factoring the product of two large primes.
2) An example is provided where a message is encrypted with a public key and decrypted with a private key.
3) The security of RSA relies on the difficulty of factoring large numbers; the best known factoring algorithms run in super-polynomial time in the number of bits.
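The encrypt/decrypt round trip in point 2 can be reproduced with the classic textbook parameters (p = 61, q = 53). This is illustrative only; real RSA uses primes hundreds of digits long and padding schemes on top of the raw arithmetic:

```python
def toy_rsa():
    """Textbook RSA with tiny primes -- never secure, purely illustrative."""
    p, q = 61, 53
    n = p * q                    # 3233, the public modulus
    phi = (p - 1) * (q - 1)      # 3120
    e = 17                       # public exponent, coprime with phi
    d = pow(e, -1, phi)          # private exponent: e*d = 1 (mod phi)
    return n, e, d

n, e, d = toy_rsa()
m = 65                           # the message, which must be < n
c = pow(m, e, n)                 # encrypt with the public key (n, e)
print(c)                         # 2790
assert pow(c, d, n) == m         # decrypt with the private key d
```

Note that `pow(e, -1, phi)` computes the modular inverse directly (Python 3.8+); older code would use the extended Euclidean algorithm.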
ATI's Total Systems Engineering Development & Management technical training c... (Jim Jenkins)
This three-day ATI professional development course, Total Systems Engineering Development & Management, covers four system development fundamentals: (1) a sound engineering management infrastructure within which work may be efficiently accomplished; (2) defining the problem to be solved (requirements and specifications); (3) solving the problem (design, integration, and optimization); and (4) proving that the design solves the defined problem (verification).
The document discusses the MPEG-4 standard for multimedia coding and transmission. MPEG-4 allows coding of audio-visual objects rather than just pixels, supports content-based interactivity, and aims for universal access and high compression over a wide bitrate range. It describes MPEG-4 video coding including coding of video object planes using motion compensation and DCT, as well as shape coding using binary and grayscale alpha planes.
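The DCT step mentioned above can be illustrated with an orthonormal 2-D DCT-II built directly from its cosine basis. This is a sketch, not the integer-arithmetic transforms real codecs specify; a flat block concentrates all of its energy in the single DC coefficient, which is what makes the transform useful for compression:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of an NxN block, built from the 1-D basis."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    # Row k of `basis` is the k-th cosine basis vector sampled at points i.
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    basis[0, :] = np.sqrt(1 / n)             # DC row gets its own scale
    return basis @ block @ basis.T

block = np.full((8, 8), 100.0)               # flat 8x8 block
coeffs = dct2(block)
print(round(float(coeffs[0, 0])))            # 800 -- DC = n * mean for n=8
# Every AC coefficient of a flat block is (numerically) zero:
assert np.allclose(coeffs[1:, :], 0) and np.allclose(coeffs[:, 1:], 0)
```

Quantizing and entropy-coding the mostly-zero coefficient grid is where the actual bit savings come from.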
ATI's Systems Engineering - Requirements technical training course sampler (Jim Jenkins)
This ATI professional development course, Systems Engineering - Requirements, provides system engineers, team leaders, and managers with a clear understanding of how to develop good specifications affordably, using modeling methods that encourage identification of the essential characteristics that must be respected in the subsequent design process.
Fundamentals of Engineering Probability Visualization Techniques & MatLab Cas... (Jim Jenkins)
This four-day course gives a solid practical and intuitive understanding of the fundamental concepts of discrete and continuous probability. It emphasizes visual aspects by using many graphical tools such as Venn diagrams, descriptive tables, trees, and a unique 3-dimensional plot to illustrate the behavior of probability densities under coordinate transformations. Many relevant engineering applications are used to crystallize crucial probability concepts that commonly arise in aerospace CONOPS and tradeoffs.
H.120, developed in 1984, was the first digital video coding standard. H.261, in the late 1980s, was the first widespread success and established the modern structure for video compression that is still used today. MPEG-1 and MPEG-2/H.262 built upon H.261 with improvements such as bidirectional prediction and half-pixel motion compensation. H.263 further enhanced compression performance and is now dominant for videoconferencing, adding features such as overlapped block motion compensation.
This three-day course is designed for engineers, scientists, project managers and other professionals who design, build, test or sell complex systems. Each topic is illustrated by real-world case studies discussed by experienced CONOPS and requirements professionals. Key topics are reinforced with small-team exercises. Over 200 pages of sample CONOPS (six) and templates are provided. Students outline CONOPS and build OpCons in class. Each student gets instructor’s slides; college-level textbook; ~250 pages of case studies, templates, checklists, technical writing tips, good and bad CONOPS; Hi-Resolution personalized Certificate of CONOPS Competency and class photo, opportunity to join US/Coalition CONOPS Community of Interest.
This document provides an overview of satellite communications fundamentals. It discusses how satellites provide capabilities not available through landlines, such as mobility and quick implementation. However, satellites are not always the most cost effective solution due to limited frequency spectrum and spatial capacity. The document describes different types of satellite services and configurations, including geostationary and non-geostationary satellites. It also covers topics like frequency reuse, earth station antennas, and satellite link delays.
Applied Physical Oceanography And Modeling (Jim Jenkins)
This three-day course is designed for engineers, physicists, acousticians, climate scientists, and managers who wish to enhance their understanding of this discipline or become familiar with how the ocean environment can affect their individual applications. Examples of remote sensing of the ocean, in situ ocean observing systems and actual examples from recent oceanographic cruises are given.
The students will be able to access educational Java applets to visualize waves and key acoustic phenomena.
Other web-based resources include acoustic demonstration podcasts and iPod apps to conduct acoustic measurements. The student will also be armed with Internet resources for up-to-date information on sonar systems, undersea sound propagation models, and environmental databases. The student will leave with a clear understanding of how the ocean influences undersea sound propagation and scattering.
Bioastronautics: Space Exploration and its Effects on the Human Body Course S... (Jim Jenkins)
This three-day course is intended for technical and managerial personnel who wish to be introduced to the effects of the space environment on humans. This course introduces bioastronautics from a fundamental perspective, assuming no prior knowledge of biology, physiology, or chemistry. The objective of the course is to provide the student with basic knowledge that will allow him or her to contribute more effectively to the human space exploration program. The human body, which through evolution is uniquely suited to function on the Earth, must adapt to the space environment, characterized by weightlessness and elevated radiation. These adaptations can impact the health and performance of astronauts, especially on return to the Earth.
Total systems engineering_development_management_course_samplerJim Jenkins
The document provides information about a training course on total systems engineering development and management from the Applied Technology Institute (ATI). It includes an outline of the course topics covering the system engineering life cycle from requirements to management. Additionally, it provides background on the instructor, Jeff Grady, and examples of structured analysis diagrams that are used in requirements analysis and system architecture definition. The course aims to teach proven practices for applying systems engineering principles across diverse product domains.
This document specifies how to encapsulate MPEG-2 Transport Stream data within DAB MSC stream data sub-channels, including adding error protection. It describes using Reed-Solomon coding and interleaving to provide outer coding and error protection. The document references ETSI EN 300 401 for information on the DAB radio broadcasting system.
Mobile data traffic is growing year over year, and mobile operators face a situation very different from the legacy voice business: revenue is not growing as fast as traffic. Operators therefore need to lower their cost per Mbps to survive; otherwise their margins will collapse.
This document provides an introduction to Reed-Solomon codes, which are word-oriented, non-binary BCH codes that are simple, robust, and perform well for burst errors. Reed-Solomon codes use Galois field techniques to encode data into blocks of length 2^m - 1 by adding 2t parity check words, allowing the correction of t errors. The encoding and decoding procedures make use of a generator polynomial, Berlekamp-Massey algorithm, Chien search, and Forney algorithm. Future work may include more flexible generator polynomials or converting C54x codes to C55x codes.
The document compares video compression standards MPEG-4 and H.264. It discusses key aspects of each including profiles, levels, uses and future applications. MPEG-4 introduced object-based coding while H.264 provides around 50% better compression than MPEG-4 at similar quality levels. Both standards are widely used for video streaming, television broadcasting, and storage applications like Blu-ray discs. Ongoing development aims to improve support for high definition video formats.
This document compares video compression standards MPEG-4 and H.264. It provides an overview of both standards, including their development histories and profiles. MPEG-4 was the first standard to support object-based video coding and compression of different media types. H.264 provides significantly better compression than prior standards like MPEG-2 at the cost of higher computational complexity. Both standards are widely used today for applications ranging from mobile and internet video to television broadcasting and digital cinema.
The document discusses video streaming and video communication applications. It outlines different types of video applications including video storage, videoconferencing, digital TV, and video streaming over the internet. It then describes properties of video communication applications such as broadcast, multicast, point-to-point, real-time encoding, static or dynamic channels, and quality of service support. Finally, it discusses variable bitrate versus constant bitrate coding and how bit allocation affects quality.
This document summarizes key techniques used in video compression codecs. It discusses still image compression techniques like block transforms, quantization, and variable length coding that video codecs build upon. It then covers motion estimation and compensation, which take advantage of similarities between frames to greatly improve compression ratio. The document outlines processing requirements for techniques like block transforms, motion estimation, and motion compensation, noting they require substantial compute resources and memory bandwidth.
This document compares video compression standards MPEG-4 and H.264. It discusses key factors for video compression like spatial and temporal sampling. It provides an overview of MPEG-4 including object-based coding, profiles and levels. H.264 is introduced as a standard that provides 50% bit rate savings over MPEG-2. Profiles and levels are explained for both standards. Common uses of each are listed, along with future development options.
The document discusses different types of video compression standards including MPEG, H.261, H.263, and JPEG. It explains key concepts in video compression like frame rate, color resolution, spatial resolution, and image quality. MPEG standards like MPEG-1, MPEG-2, MPEG-4, and MPEG-7 are defined for compressing video and audio at different bit rates. Techniques like spatial and temporal redundancy reduction are used to compress individual frames and sequences of consecutive frames. Compression reduces file sizes, though lossy methods discard some of the original data in the process.
This document discusses various topics related to data compression including compression techniques, audio compression, video compression, and standards like MPEG and JPEG. It covers lossless versus lossy compression, explaining that lossy compression can achieve much higher levels of compression but results in some loss of quality, while lossless compression maintains the original quality. The advantages of data compression include reducing file sizes, saving storage space and bandwidth.
Video encoding uses various techniques to compress video files in a lossy manner. It involves representing color information using RGB or YCbCr color spaces, sampling and quantizing signals to convert them to digital form, using the Fourier transform to analyze signal frequencies, windowing to divide signals for transform analysis, inter-frame encoding to remove redundancy between frames, and intra-frame encoding to remove redundancy within frames. Key compression techniques include motion compensation between inter-coded frames and periodic insertion of intra-coded frames.
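The RGB-to-YCbCr step mentioned above can be sketched in a few lines. The sketch below is illustrative Python using the BT.601 full-range matrix, which is an assumption on my part; the summarized document does not say which conversion matrix it has in mind.

```python
# Illustrative RGB -> YCbCr conversion (BT.601 full-range coefficients
# assumed). Y carries luma; Cb/Cr carry chroma, centred on 128 so that
# neutral greys map to (Y, 128, 128) and chroma can be subsampled.
def rgb_to_ycbcr(r, g, b):
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

# White has full luma and neutral chroma: Y ~ 255, Cb = Cr = 128.
print(rgb_to_ycbcr(255, 255, 255))
```

Because the human eye is less sensitive to chroma than to luma, the Cb/Cr planes are typically subsampled (4:2:0) after this conversion, which is itself a first, cheap compression step.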
H.264, also known as MPEG-4 Part 10 or AVC, is a video compression standard that provides significantly better compression than previous standards such as MPEG-2. It achieves this through spatial and temporal redundancy reduction techniques including intra-frame prediction, inter-frame prediction, and entropy coding. Motion estimation, which finds motion vectors between frames to enable inter-frame prediction, is the most computationally intensive part of H.264 encoding. Previous GPU implementations of H.264 motion estimation have sacrificed quality for parallelism or have not fully addressed dependencies between blocks. This document proposes a pyramid motion estimation approach on GPU that can better address dependencies while maintaining quality.
1. The document discusses video compression technology, including digital television formats, video compression standards like MPEG-2 and H.264, video quality metrics, and video coding concepts.
2. Key video coding concepts covered are temporal compression using motion estimation and compensation between frames, spatial compression within frames using DCT transform and quantization, and entropy coding of coefficients.
3. Video compression aims to reduce the data required for transmission by removing spatial and temporal redundancy in video sequences.
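As a toy illustration of the temporal-redundancy point, a frame can be coded as a residual against the previous frame; the hypothetical 1-D "frames" of pixel values below are invented for illustration.

```python
# Temporal redundancy removal, minimally: store only the difference
# between a frame and its prediction (here, simply the previous frame).
frame1 = [10, 12, 14, 200]
frame2 = [10, 13, 14, 205]   # mostly unchanged from frame1

residual = [b - a for a, b in zip(frame1, frame2)]
print(residual)              # small values, which entropy-code cheaply

# The decoder reconstructs frame2 exactly from frame1 plus the residual.
reconstructed = [a + r for a, r in zip(frame1, residual)]
assert reconstructed == frame2
```

In a real codec the prediction is motion-compensated rather than a plain copy, and the residual is then transformed and quantized, but the save-only-the-difference idea is the same.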
Compression: Video Compression (MPEG and others)
This document provides an overview of video compression techniques used in standards like MPEG and H.261. It discusses how uncompressed video data requires huge storage and bandwidth that compression aims to address. It explains that lossy compression methods are needed to achieve sufficient compression ratios. The key techniques discussed are intra-frame coding using DCT and quantization similar to JPEG, and inter-frame coding using motion estimation and compensation to remove temporal redundancy between frames. Motion vectors are found using techniques like block matching and sum of absolute differences. MPEG and other standards use a combination of these intra and inter-frame coding techniques to efficiently compress video for storage and transmission.
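A full-search block matcher of the kind described, minimising the sum of absolute differences (SAD) over every candidate position, can be sketched as follows. This is illustrative only; real encoders restrict the search to a window around the block and use faster strategies than exhaustive search.

```python
# Sum of absolute differences between a block and the reference frame
# region whose top-left corner is at (x, y).
def sad(block, ref, x, y):
    return sum(abs(block[j][i] - ref[y + j][x + i])
               for j in range(len(block))
               for i in range(len(block[0])))

# Exhaustive block matching: try every valid position and keep the best.
def motion_search(block, ref):
    bh, bw = len(block), len(block[0])
    best = None
    for y in range(len(ref) - bh + 1):
        for x in range(len(ref[0]) - bw + 1):
            cost = sad(block, ref, x, y)
            if best is None or cost < best[0]:
                best = (cost, (x, y))
    return best  # (SAD, (x, y)) of the best match

ref = [[0] * 8 for _ in range(8)]
ref[3][4] = ref[3][5] = ref[4][4] = ref[4][5] = 9   # a 2x2 feature
block = [[9, 9], [9, 9]]                            # block to locate
print(motion_search(block, ref))                    # exact match at (4, 3)
```

The best (x, y) offset becomes the motion vector, and only that vector plus the (ideally near-zero) residual needs to be transmitted.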
This document proposes a method for preserving privacy in video surveillance by scrambling regions of interest (ROIs) in video sequences. It discusses scrambling quantized DCT or DWT coefficients in compressed video to conceal information in ROIs while maintaining understanding of the overall scene. The scrambling is flexible and reversible with a private key, has low computational complexity, and introduces minimal impact on video coding performance. Previous approaches are also summarized.
Introduction to Video Compression Techniques - Anurag Jain
The document provides an overview of video compression techniques and standards. It discusses the motivation for video compression to reduce data sizes for storage and transmission. It then reviews several key video compression standards including H.261, H.263, MPEG-1, MPEG-2, MPEG-4, H.264 and others. For each standard, it summarizes the goals, features, applications and technical details like motion compensation methods, block sizes, and bitrate ranges.
The document discusses 3D graphics compression standards. It provides an overview of MPEG's work in developing standards for compressing 3D graphics content, similar to how other standards compress video and audio. This includes MPEG-4's initial work with surfaces like Indexed Face Sets as well as later efforts involving patches and subdivision surfaces to improve compression ratios and representation of curved surfaces. The goal is to standardize a format for compressed 3D graphics to enable widespread use in applications.
MPEG is a video compression standard developed in the late 1980s to enable full-motion video over networks and storage mediums. It was created by the Motion Picture Experts Group to address the need for high compression ratios to transmit video given bandwidth limitations of the time. MPEG uses spatial and temporal redundancy reduction techniques like discrete cosine transformation, quantization, and entropy coding to compress video frames and take advantage of similarities between neighboring pixels and successive frames. It defines a group of pictures structure and different frame types like I, P, and B frames to enable features like random access while maintaining synchronization and error robustness. MPEG became widely adopted and evolved through standards like MPEG-1, MPEG-2, and MPEG-4.
This document discusses digital video codecs and compression. It begins by defining pixel resolutions for standard definition, high definition, and digital cinema. It then covers CMOS image sensors used for HD, 2K and 4K capture and explains intra-frame and inter-frame compression. The document provides an example of the Apple ProRes 422 codec and analyzes its key attributes. It also discusses interlaced vs progressive scanning, picture impairments from compression, digital cinema standards, and predicts that requirements on compression will reduce over time due to technological advances.
Video is recorded as a sequence of images called frames. A minimum of 16 frames per second is needed for smooth motion. Digital video requires large storage due to uncompressed size. Various techniques are used for compression, including filtering, downscaling resolution and frame rate, transforming frames to the frequency domain, quantizing coefficients, and interframe compression by storing differences between frames. MPEG standards use intraframe and interframe compression along with a system layer to form a single stream for storage and transmission of video and audio. MPEG-1 achieves around 1.5 Mbps for video CDs while MPEG-2 is used for digital TV at higher bitrates. Streaming video adapts to varying network speeds by buffering and dropping frames.
AI research is enabling more efficient video and voice codecs through techniques like generative models and deep learning. Qualcomm's latest research includes a neural video codec that achieves state-of-the-art compression rates compared to other learned video compression solutions. Their work on B-frame coding also provides improved rate-distortion results by extending neural P-frame codecs to allow for B-frame coding and interpolation. Future research aims to develop more efficient on-device deployment methods and semantically aware compression focused on regions of interest.
This document provides an overview of various video compression techniques and standards. It discusses fundamentals of digital video including frame rate, color resolution, spatial resolution, and image quality. It describes different compression techniques like intraframe, interframe, and lossy vs lossless. Key video compression standards discussed include MPEG-1, MPEG-2, MPEG-4, H.261, H.263 and JPEG for still image compression. Factors that impact compression like compression ratio, bit rate control, and real-time vs non-real-time are also summarized.
The document provides an overview of MPEG-4, a standard that offers both advanced audio and video codecs as well as tools for combining multimedia such as audio, video, graphics and interactivity. It was developed through an open international process to select the best technologies. MPEG-4 codecs like AVC and AAC provide high compression efficiency, having been adopted for HDTV, mobile video, and digital music. Its rich media tools allow interactive experiences combining different media types.
This document provides an overview of Codan's 6700/6900 series block up converter (BUC) systems and components. It describes the BUC, low-noise block converter (LNB), and redundancy systems. It also covers installation, operation, and troubleshooting of the systems. The document contains information on frequency bands, conversion plans, interfaces, cable connections, monitor/control, commands, maintenance procedures, and compliance standards.
This document discusses digital set-top boxes (STBs) and related standards. It covers:
1) The DVB standards for digital TV broadcasting via different transmission media, including DVB-T for terrestrial, DVB-S for satellite, and DVB-C for cable. These share source coding/compression and service multiplexing standards.
2) STBs will be needed until integrated digital TVs are cheaper. Affordable STBs are key for digital TV adoption. Common standards help lower STB costs through economies of scale.
3) "Open architecture" and "interoperability" mean the STB functionality is defined by public standards and can receive services across networks, respectively.
The document discusses DCT/IDCT concepts and applications. It provides an introduction to DCT and IDCT, explaining that they are used widely in video and audio compression. It describes the DCT and IDCT functions and how they work to transform signals between spatial and frequency domains. Examples of one-dimensional and two-dimensional DCT/IDCT equations are also given. Finally, common applications of DCT/IDCT compression techniques are listed, such as in DVD players, cable TV, graphics cards, and medical imaging systems.
This document discusses image compression using the discrete cosine transform (DCT). It develops simple Mathematica functions to compute the 1D and 2D DCT. The 1D DCT transforms a list of real numbers into elementary frequency components. It is computed via matrix multiplication or using the discrete Fourier transform with twiddle factors. The 2D DCT applies the 1D DCT to rows and then columns of an image, making it separable. These functions illustrate how Mathematica can be used to prototype image processing algorithms.
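The separability described there, a 1-D DCT applied first to rows and then to columns, can be prototyped in Python just as easily as in Mathematica. The sketch below assumes the orthonormal DCT-II normalisation; the summarized document may use a different scaling convention.

```python
import math

# 1-D DCT-II with orthonormal scaling: decomposes a list of reals into
# elementary cosine frequency components.
def dct_1d(xs):
    n = len(xs)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, x in enumerate(xs))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

# 2-D DCT via separability: transform the rows, then the columns.
def dct_2d(block):
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A constant block puts all of its energy into the single DC coefficient.
flat = [[8.0] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
print(coeffs[0][0])   # DC term; every other coefficient is ~0
```

This energy-compaction behaviour, where smooth image content collapses into a few low-frequency coefficients, is exactly what makes the subsequent quantization step so effective.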
DVB-S2 is the second-generation specification for satellite broadcasting developed by DVB in 2003. It uses more advanced channel coding (LDPC codes) and modulation formats (QPSK, 8PSK, 16APSK, 32APSK) for a 30% increase in transmission capacity over DVB-S. DVB-S2 allows for adaptive coding and modulation to optimize transmission for each user. It is designed for broadcast, interactive, and professional applications with flexibility to handle different transponder characteristics and content formats.
The STi7167 is an integrated system-on-chip that combines a configurable DVB-T or DVB-C demodulator with STB decoding and display functions. It provides advanced HD and SD video decoding, audio decoding, graphics processing, and connectivity options. The chip's integrated features allow for low cost and small size STB designs for cable or terrestrial networks.
This document provides an overview of service information (SI) in digital video broadcasting (DVB) systems, including sections like the network information section (NIT), service description section (SDT), bouquet association section (BAT), program association section (PAT), conditional access section (CAT), transport stream description section (TSDT), event information section (EIT), and running status section (RST). It includes syntax diagrams and details for each section, such as table IDs, section lengths, descriptors, and other fields. It also provides the PID and refresh interval requirements for each table type.
1) The document describes a modification to the Huffman coding used in JPEG image compression. It proposes pairing each non-zero DCT coefficient with the run-length of subsequent (rather than preceding) zero coefficients.
2) This allows using separate optimized Huffman code tables for each DCT coefficient position, improving compression by 10-15% over standard JPEG coding.
3) The decoding procedure is not changed and no end-of-block marker is needed, providing advantages with no increase in complexity.
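The difference between the two pairings can be sketched on a toy zig-zag-ordered sequence. The coefficient values below are invented for illustration, and real JPEG additionally splits coefficient values into size categories, which this sketch omits.

```python
# One zig-zag-ordered coefficient sequence (hypothetical values,
# assumed to start with a non-zero DC coefficient).
coeffs = [5, 0, 0, 3, 0, 1, 0, 0, 0, 0]

# Standard JPEG pairing: (run of PRECEDING zeros, coefficient).
def jpeg_pairs(cs):
    pairs, run = [], 0
    for c in cs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs  # trailing zeros become an EOB marker in real JPEG

# Proposed pairing: (coefficient, run of FOLLOWING zeros) -- the final
# run reaches the end of the block, so no separate EOB marker is needed.
def modified_pairs(cs):
    pairs, i = [], 0
    while i < len(cs):
        c = cs[i]
        i += 1
        run = 0
        while i < len(cs) and cs[i] == 0:
            run += 1
            i += 1
        pairs.append((c, run))
    return pairs

print(jpeg_pairs(coeffs))      # [(0, 5), (2, 3), (1, 1)]
print(modified_pairs(coeffs))  # [(5, 2), (3, 1), (1, 4)]
```

Because the run now follows a known coefficient position, a separate Huffman table can be optimized per position, which is where the claimed 10-15% gain comes from.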
The document provides implementation guidelines for using the DVB Simulcrypt standard, including describing the architecture and protocols, clarifying differences between protocol versions, explaining state diagrams and behaviors, and providing recommendations for error handling, redundancy management, and custom signaling profiles to facilitate reliable and efficient Simulcrypt headend implementation.
1) The document discusses quantization and pulse code modulation (PCM) in voice signal encoding. PCM assigns 256 possible values to digitally represent analog voice samples, divided into chords and steps on a linear scale.
2) A logarithmic quantization scale is better than a linear one for voice signals, as it allocates more quantization steps to lower amplitudes prevalent in speech. This "compressed encoding" improves fidelity.
3) Quantization error occurs when samples with different amplitudes are assigned the same digital value, distorting the reconstructed waveform. Compression helps maintain a higher signal-to-noise ratio especially for low amplitudes.
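The logarithmic scale described in point 2 is, in North American PCM, the mu-law companding curve. The sketch below assumes mu = 255 and normalised amplitudes in [-1, 1]; it shows the curve itself rather than the chord-and-step table a real codec uses.

```python
import math

MU = 255.0  # mu-law constant used in North American / Japanese PCM

# Compress: logarithmic mapping that expands low amplitudes.
def compress(x):          # x in [-1, 1]
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# Expand: exact inverse of compress.
def expand(y):
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A 1%-of-full-scale input occupies roughly 23% of the compressed range,
# so quiet speech gets far more quantization steps than a linear scale
# would give it.
print(compress(0.01))
assert abs(expand(compress(0.3)) - 0.3) < 1e-9   # round trip
```

Quantizing the compressed value with a uniform quantizer is then equivalent to quantizing the original with a non-uniform one, which is the "compressed encoding" the text refers to.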
This document provides implementation guidelines for the DVB Simulcrypt standard. It describes the architecture and protocols involved in simulcrypt systems, including the ECMG protocol between the security client system and conditional access modules, and the EMMG/PDG protocol between conditional access modules and multiplex equipment. The document outlines differences between version 1 and 2 of the standards, and provides recommendations for compliance. It also includes detailed state diagrams and descriptions of the protocols involved.
The Event Logger monitors and logs Digital Program Insertion (DPI) messages to verify correct transmission of signals via satellite. It watches for configured GPI state changes that indicate an expected DPI message. If the message is received on time, it is logged as a matched event. If not received on time, it is flagged as missed. The Event Logger also decodes DPI messages to help diagnose issues, and is compatible with various encoding systems. It has 6 ASI inputs, 108 GPI sensors, and logs data in real-time and for archiving.
This document discusses the basics of BISS scrambling. It describes BISS mode 1, which uses a session word, and BISS mode E, which encrypts the session word using an identifier and encryption algorithm. BISS mode E provides an additional layer of protection for transmitting the session word. The document also covers calculating the encrypted session word, using buried and injected identifiers, and how to operate scramblers in the different BISS modes.
The document discusses quantization in analog-to-digital conversion. It describes the three processes of A/D conversion as sampling, quantization, and binary encoding. Quantization involves mapping amplitude values into a set of discrete values using a quantization interval or step size. The document discusses uniform quantization and how the quantization levels are determined. It also covers non-uniform quantization and provides examples and MATLAB code demonstrations of audio signal quantization.
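A uniform quantizer with a fixed step size can be demonstrated in a few lines; the sketch below is illustrative Python rather than the document's MATLAB demonstrations, and the mid-rise layout with midpoint reconstruction is an assumed design choice.

```python
# Mid-rise uniform quantizer: map an amplitude in [lo, hi] to one of
# 2**bits levels of equal width, reconstructing at interval midpoints
# so the maximum quantization error is step/2.
def quantize(x, lo=-1.0, hi=1.0, bits=3):
    levels = 2 ** bits
    step = (hi - lo) / levels          # quantization interval (step size)
    idx = min(int((x - lo) / step), levels - 1)
    return lo + (idx + 0.5) * step

# With 8 levels over [-1, 1] the step is 0.25, so 0.30 falls in the
# interval [0.25, 0.5) and is reconstructed as its midpoint, 0.375.
print(quantize(0.30))
```

Increasing `bits` by one halves the step size, and hence the worst-case error, which is the usual ~6 dB-per-bit SNR rule of thumb for uniform quantization.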
1) Reed-Solomon codes are a type of error-correcting code invented in 1960 that can detect and correct multiple symbol errors. They work by encoding data into redundant symbols that can be used to detect and locate errors.
2) Reed-Solomon codes are particularly good at correcting burst errors, where a block of symbols are corrupted together by noise. Even if an entire block of bits is corrupted, the code can still correct the errors by replacing the corrupted symbol.
3) The error correction capability of Reed-Solomon codes increases with larger block sizes, as noise is averaged over more symbols. However, implementing Reed-Solomon codes also becomes more complex with higher redundancy.
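The block-size trade-off in point 3 is easy to quantify. The sketch below uses the RS(255, 223) parameters of the classic CCSDS deep-space code as a worked example; that particular choice of code is an assumption, not something the summarized document specifies.

```python
# Reed-Solomon parameter arithmetic: a code over GF(2**m) has
# n = 2**m - 1 symbols per codeword, n - k = 2t parity symbols,
# and corrects up to t symbol errors.
def rs_params(m, t):
    n = 2 ** m - 1           # codeword length in symbols
    k = n - 2 * t            # data symbols per codeword
    return n, k, t

# RS(255, 223) over GF(2^8): 32 parity bytes correct 16 byte errors.
n, k, t = rs_params(m=8, t=16)
print(n, k, t)               # 255 223 16

# Burst tolerance: since any (t-1)*m + 1 = 121 consecutive corrupted
# bits fall into at most t = 16 symbols, every such burst is correctable.
print((t - 1) * 8 + 1)
```

This symbol-level view is why the text calls RS codes "particularly good at correcting burst errors": many corrupted bits inside one symbol still count as only a single symbol error.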
This document describes the head-end architecture and synchronization for digital video broadcasting using SimulCrypt. It outlines the system components including an event information scheduler, SimulCrypt synchronizer, entitlement control message generator, entitlement management message generator, and multiplexer. It also describes the interfaces between these components, covering processes like channel and stream establishment and closure, as well as bandwidth allocation and status reporting.
This document provides the European standard for the frame structure, channel coding and modulation for a second generation digital transmission system for cable systems (DVB-C2). It defines the system architecture and specifications for input processing, bit-interleaved coding and modulation, data slice packet generation, layer 1 part 2 signalling, frame building, and OFDM generation. The standard aims to provide improved performance for cable systems over the existing DVB-C standard.
Presentation of the OECD Artificial Intelligence Review of Germany
video_compression_2004
1. Video Compression
MIT 6.344, Spring 2004
John G. Apostolopoulos
Streaming Media Systems Group
Hewlett-Packard Laboratories
japos@hpl.hp.com
John G. Apostolopoulos
April 22, 2004 Page 1
2. Overview of Next Three Lectures
Today • Video Compression (Thurs, 4/22)
– Principles and practice of video coding
– Basics behind MPEG compression algorithms
– Current image & video compression standards
• Video Communication & Video Streaming I (Tues, 4/27)
– Video application contexts & examples: DVD and Digital TV
– Challenges in video streaming over the Internet
– Techniques for overcoming these challenges
• Video Communication & Video Streaming II (Thurs, 4/29)
– Video over lossy packet networks and wireless links → Error-
resilient video communications
3. Outline of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
4. Motivation for Compression: Example of HDTV Video Signal
• Problem:
– Raw video contains an immense amount of data
– Communication and storage capabilities are limited
and expensive
• Example HDTV video signal:
– 720x1280 pixels/frame, progressive scanning at
60 frames/s:
(720 × 1280 pixels/frame) × (60 frames/s) × (3 colors/pixel) × (8 bits/color) ≈ 1.3 Gb/s
– 20 Mb/s HDTV channel bandwidth
→ Requires compression by a factor of 70 (equivalent to 0.35 bits/pixel)
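This back-of-the-envelope calculation is easy to verify; a short Python check using the figures above:

```python
# Raw HDTV bit rate and required compression factor, from the slide's numbers.
pixels_per_frame = 720 * 1280
frames_per_sec = 60
colors_per_pixel = 3
bits_per_color = 8

raw_bps = pixels_per_frame * frames_per_sec * colors_per_pixel * bits_per_color
channel_bps = 20e6  # 20 Mb/s HDTV channel bandwidth

compression_factor = raw_bps / channel_bps                  # roughly 70x
bits_per_pixel = (colors_per_pixel * bits_per_color) / compression_factor
```

The exact ratio works out just above 66x; the slide rounds to 70.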
5. Achieving Compression
• Reduce redundancy and irrelevancy
• Sources of redundancy
– Temporal: Adjacent frames highly correlated
– Spatial: Nearby pixels are often correlated with
each other
– Color space: RGB components are correlated
among themselves
→ Relatively straightforward to exploit
• Irrelevancy
– Perceptually unimportant information
→ Difficult to model and exploit
6. Spatial and Temporal Redundancy
• Why can video be compressed?
– Video contains much spatial and temporal redundancy.
• Spatial redundancy: Neighboring pixels are similar
• Temporal redundancy: Adjacent frames are similar
Compression is achieved by exploiting the spatial and temporal
redundancy inherent to video.
7. Outline of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
8. Generic Compression System
Original Signal → Representation (Analysis) → Quantization → Binary Encoding → Compressed Bitstream
A compression system is composed of three key building blocks:
• Representation
– Concentrates important information into a few parameters
• Quantization
– Discretizes parameters
• Binary encoding
– Exploits non-uniform statistics of quantized parameters
– Creates bitstream for transmission
9. Generic Compression System (cont.)
Original Signal → Representation (Analysis) [generally lossless] → Quantization [lossy] → Binary Encoding [lossless] → Compressed Bitstream
• Generally, the only operation that is lossy is the
quantization stage
• The fact that all the loss (distortion) is localized to a
single operation greatly simplifies system design
• Can design loss to exploit human visual system (HVS)
properties
10. Generic Compression System (cont.)
Source Encoder: Original Signal → Representation (Analysis) → Quantization → Binary Encoding → Compressed Bitstream → Channel
Source Decoder: Channel → Binary Decoding → Inverse Quantization → Representation (Synthesis) → Reconstructed Signal
• Source decoder performs the inverse of each of the three
operations
11. Review of Image Compression
Original Image → RGB to YUV → Block DCT → Quantization → Runlength & Huffman Coding → Compressed Bitstream
• Coding an image (single frame):
– RGB to YUV color-space conversion
– Partition image into 8x8-pixel blocks
– 2-D DCT of each block
– Quantize each DCT coefficient
– Runlength and Huffman code the nonzero quantized DCT
coefficients
→ Basis for the JPEG Image Compression Standard
→ JPEG-2000 uses wavelet transform and arithmetic coding
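To make the representation step concrete, here is a minimal, unoptimized sketch of the 2-D DCT of an 8x8 block in pure Python. This is illustrative only; real coders use fast factorizations rather than this O(n⁴) form:

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an n x n block (list of rows), for illustration."""
    n = len(block)
    def c(k):  # orthonormal scaling factors
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat (constant) block: all energy concentrates in the DC coefficient.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct2(flat)
```

Because smooth image blocks concentrate their energy in a few low-frequency coefficients, the subsequent quantization and runlength steps can discard or cheaply code most of the AC terms.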
12. Outline of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
13. Video Compression
• Video: Sequence of frames (images) that are related
• Related along the temporal dimension
– Therefore temporal redundancy exists
• Main addition over image compression
– Temporal redundancy
→ Video coder must exploit the temporal redundancy
14. Temporal Processing
• Usually high frame rate: Significant temporal redundancy
• Possible representations along temporal dimension:
– Transform/subband methods
– Good for textbook case of constant velocity uniform
global motion
– Inefficient for nonuniform motion, i.e. real-world motion
– Requires large number of frame stores
– Leads to delay (Memory cost may also be an issue)
– Predictive methods
– Good performance using only 2 frame stores
– However, simple frame differencing is not enough…
15. Video Compression
• Goal: Exploit the temporal redundancy
• Predict current frame based on previously coded frames
• Three types of coded frames:
– I-frame: Intra-coded frame, coded independently of all
other frames
– P-frame: Predictively coded frame, coded based on
previously coded frame
– B-frame: Bi-directionally predicted frame, coded based
on both previous and future coded frames
[Figure: I-frame, P-frame, and B-frame prediction dependencies]
16. Temporal Processing: Motion-Compensated Prediction
• Simple frame differencing fails when there is motion
• Must account for motion
→ Motion-compensated (MC) prediction
• MC-prediction generally provides significant improvements
• Questions:
– How can we estimate motion?
– How can we form MC-prediction?
17. Temporal Processing: Motion Estimation
• Ideal situation:
– Partition video into moving objects
– Describe object motion
→ Generally very difficult
• Practical approach: Block-Matching Motion Estimation
– Partition each frame into blocks, e.g. 16x16 pixels
– Describe motion of each block
→ No object identification required
→ Good, robust performance
18. Block-Matching Motion Estimation
[Figure: numbered blocks in the reference frame and current frame; the motion vector (mv1, mv2) relates a block in the current frame to its best-matching block in the reference frame]
• Assumptions:
– Translational motion within block:
f(n1, n2, k_cur) = f(n1 − mv1, n2 − mv2, k_ref)
– All pixels within each block have the same motion
• ME Algorithm:
1) Divide current frame into non-overlapping N1xN2 blocks
2) For each block, find the best matching block in reference frame
• MC-Prediction Algorithm:
– Use best matching blocks of reference frame as prediction of
blocks in current frame
19. Block Matching: Determining the Best Matching Block
• For each block in the current frame search for best matching
block in the reference frame
– Metrics for determining “best match”:
MSE = Σ_{(n1,n2) ∈ Block} [f(n1, n2, k_cur) − f(n1 − mv1, n2 − mv2, k_ref)]²
MAE = Σ_{(n1,n2) ∈ Block} |f(n1, n2, k_cur) − f(n1 − mv1, n2 − mv2, k_ref)|
– Candidate blocks: All blocks in, e.g., a (±32, ±32)-pixel area
– Strategies for searching candidate blocks for best match
– Full search: Examine all candidate blocks
– Partial (fast) search: Examine a carefully selected subset
• Estimate of motion for best matching block: “motion vector”
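The full-search procedure can be sketched in a few lines of Python, with frames as toy lists of rows and MAE as the match metric (the block size and search range here are illustrative):

```python
def mae(block_a, block_b):
    """Mean absolute error between two equal-size blocks (lists of rows)."""
    n = len(block_a) * len(block_a[0])
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / n

def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def full_search(cur, ref, top, left, size, search=2):
    """Full search: evaluate every displacement in a (±search, ±search)
    window and keep the motion vector that minimizes MAE."""
    target = get_block(cur, top, left, size)
    best_mv, best_err = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(ref) - size and 0 <= x <= len(ref[0]) - size:
                err = mae(target, get_block(ref, y, x, size))
                if err < best_err:
                    best_mv, best_err = (dy, dx), err
    return best_mv, best_err

# Toy example: a 2x2 pattern at (1, 1) in the reference appears at (2, 3)
# in the current frame, so that block's motion vector is (-1, -2).
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
ref[1][1], ref[1][2], ref[2][1], ref[2][2] = 9, 8, 7, 6
cur[2][3], cur[2][4], cur[3][3], cur[3][4] = 9, 8, 7, 6
mv, err = full_search(cur, ref, top=2, left=3, size=2)
```

The same loop with the squared difference gives the MSE criterion; MAE (often called SAD when unnormalized) is preferred in practice because it avoids multiplications.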
20. Motion Vectors and Motion Vector Field
• Motion vector
– Expresses the relative horizontal and vertical offsets
(mv1,mv2), or motion, of a given block from one
frame to another
– Each block has its own motion vector
• Motion vector field
– Collection of motion vectors for all the blocks in a
frame
21. Example of Fast Motion Estimation Search: 3-Step (Log) Search
• Goal: Reduce number of search
points
• Example: (±7, ±7) search area
• Dots represent search points
• Search performed in 3 steps
(coarse-to-fine):
Step 1: ±4 pixels
Step 2: ±2 pixels
Step 3: ±1 pixel
• Best match is found at each step
• Next step: Search is centered
around the best match of prior step
• Speedup increases for larger
search areas
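A sketch of the three-step search, written against an abstract cost function so the search logic stands alone (the cost function below is a toy stand-in for the block-matching error):

```python
def three_step_search(cost):
    """Three-step (log) search: start at (0, 0), test the 3x3 grid of
    candidates at step sizes 4, 2, 1, re-centering on the best match
    after each step. `cost(mv1, mv2)` scores a candidate (lower = better)."""
    best = (0, 0)
    for step in (4, 2, 1):
        candidates = [(best[0] + dx, best[1] + dy)
                      for dx in (-step, 0, step)
                      for dy in (-step, 0, step)]
        best = min(candidates, key=lambda mv: cost(*mv))
    return best

# Toy cost surface with its minimum at (3, -2): three 3x3 grids means at
# most 25 distinct evaluations, versus 225 for a full (±7, ±7) search.
mv = three_step_search(lambda a, b: (a - 3) ** 2 + (b + 2) ** 2)
```

Like all partial searches, this can get trapped in a local minimum of the matching error, which is the price paid for the speedup.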
22. Motion Vector Precision?
• Motivation:
– Motion is not limited to integer-pixel offsets
– However, video only known at discrete pixel locations
– To estimate sub-pixel motion, frames must be spatially
interpolated
• Fractional MVs are used to represent the sub-pixel motion
• Improved performance (extra complexity is worthwhile)
• Half-pixel ME used in most standards: MPEG-1/2/4
• Why are half-pixel motion vectors better?
– Can capture half-pixel motion
– Averaging effect (from spatial interpolation) reduces
prediction error → Improved prediction
– For noisy sequences, averaging effect reduces noise →
Improved compression
23. Practical Half-Pixel Motion Estimation Algorithm
• Half-pixel ME (coarse-fine) algorithm:
1) Coarse step: Perform integer motion estimation on blocks; find
best integer-pixel MV
2) Fine step: Refine estimate to find best half-pixel MV
a) Spatially interpolate the selected region in reference frame
b) Compare current block to interpolated reference frame
block
c) Choose the integer or half-pixel offset that provides best
match
• Typically, bilinear interpolation is used for spatial interpolation
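The bilinear interpolation used in step 2a can be sketched as follows (frame as a list of rows; sampling at a half-pixel position blends the four surrounding pixels):

```python
def bilinear(frame, y, x):
    """Sample `frame` (list of rows) at a possibly fractional position
    (y, x) by bilinear interpolation of the four surrounding pixels."""
    y0, x0 = int(y), int(x)
    fy, fx = y - y0, x - x0
    y1 = min(y0 + 1, len(frame) - 1)    # clamp at the frame border
    x1 = min(x0 + 1, len(frame[0]) - 1)
    return ((1 - fy) * (1 - fx) * frame[y0][x0]
            + (1 - fy) * fx * frame[y0][x1]
            + fy * (1 - fx) * frame[y1][x0]
            + fy * fx * frame[y1][x1])

frame = [[0, 100],
         [50, 150]]
half_pel = bilinear(frame, 0.5, 0.5)   # average of the four neighbors
integer_pel = bilinear(frame, 1, 0)    # falls back to the stored pixel
```

The averaging visible in the half-pixel case is exactly the smoothing effect the slide credits with reducing prediction error and noise.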
24. Example: MC-Prediction for Two Consecutive Frames
[Figure: previous (reference) frame and current frame to be predicted; each numbered block in the current frame is predicted from its best-matching block in the reference frame]
25. Example: MC-Prediction for Two Consecutive Frames (cont.)
[Figure: prediction of the current frame, and the prediction error (residual)]
26. Block Matching Algorithm: Summary
• Issues:
– Block size?
– Search range?
– Motion vector accuracy?
• Motion typically estimated only from luminance
• Advantages:
– Good, robust performance for compression
– Resulting motion vector field is easy to represent (one MV
per block) and useful for compression
– Simple, periodic structure, easy VLSI implementations
• Disadvantages:
– Assumes translational motion model → Breaks down for
more complex motion
– Often produces blocking artifacts (OK for coding with
Block DCT)
27. Bi-Directional MC-Prediction
[Figure: previous frame, current frame, and future frame; a block in the current frame can be predicted from either neighbor or their average]
• Bi-Directional MC-Prediction is used to estimate a block in the
current frame from a block in:
1) Previous frame
2) Future frame
3) Average of a block from the previous frame and a block
from the future frame
4) Neither, i.e. code current block without prediction
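A toy sketch of this mode decision, choosing whichever of the four options leaves the smallest residual energy (1-D blocks and squared error are illustrative simplifications of what a real encoder measures):

```python
def best_b_mode(block, fwd_pred, bwd_pred):
    """Pick the B-frame prediction option with the smallest residual
    energy: previous frame, future frame, their average, or intra
    (no prediction at all)."""
    def sse(pred):
        return sum((b - p) ** 2 for b, p in zip(block, pred))
    avg = [(f + b) / 2 for f, b in zip(fwd_pred, bwd_pred)]
    costs = {
        "previous": sse(fwd_pred),
        "future": sse(bwd_pred),
        "average": sse(avg),
        "intra": sum(b * b for b in block),   # cost of coding the block raw
    }
    return min(costs, key=costs.get)

# The averaging mode wins when the block lies "between" its two references:
mode = best_b_mode([4, 6], fwd_pred=[0, 0], bwd_pred=[8, 12])
```

The averaging option is what makes B-frames particularly effective for smoothly moving or noisy content.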
28. MC-Prediction and Bi-Directional MC-Prediction (P- and B-frames)
• Motion compensated prediction: Predict the current frame
based on reference frame(s) while compensating for the motion
• Examples of block-based motion-compensated prediction
(P-frame) and bi-directional prediction (B-frame):
[Figure: block-based MC-prediction of a P-frame from the previous frame, and bi-directional prediction of a B-frame from the previous and future frames]
29. Video Compression
• Main addition over image compression:
– Exploit the temporal redundancy
• Predict current frame based on previously coded frames
• Three types of coded frames:
– I-frame: Intra-coded frame, coded independently of all
other frames
– P-frame: Predictively coded frame, coded based on
previously coded frame
– B-frame: Bi-directionally predicted frame, coded based
on both previous and future coded frames
[Figure: I-frame, P-frame, and B-frame prediction dependencies]
30. Example Use of I-, P-, B-frames: MPEG Group of Pictures (GOP)
• Arrows show prediction dependencies between frames
I0 B1 B2 P3 B4 B5 P6 B7 B8 I9
MPEG GOP
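One practical consequence of these dependencies (standard MPEG behavior, though not shown on the slide) is that transmission order differs from display order: each anchor frame (I or P) must arrive before the B-frames that reference it. A sketch using the frame labels above:

```python
def transmission_order(display_order):
    """Reorder a display-order GOP into MPEG transmission (decode) order:
    anchor frames (I or P) are moved ahead of the B-frames that
    reference them, so the decoder has both references in hand."""
    out, pending_b = [], []
    for frame in display_order:
        if frame[0] in "IP":          # anchor: emit it, then the waiting Bs
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
        else:                          # B-frame: hold until its future anchor
            pending_b.append(frame)
    return out + pending_b

gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "I9"]
order = transmission_order(gop)
```

This reordering is also why B-frames introduce at least one frame of extra latency.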
31. Summary of Temporal Processing
• Use MC-prediction (P and B frames) to reduce temporal
redundancy
• MC-prediction usually performs well; in compression there is a second chance to recover when it performs badly
• MC-prediction yields:
– Motion vectors
– MC-prediction error or residual → Code error with
conventional image coder
• Sometimes MC-prediction may perform badly
– Examples: Complex motion, new imagery (occlusions)
– Approach:
1. Identify frame or individual blocks where prediction fails
2. Code without prediction
32. Basic Video Compression Architecture
• Exploiting the redundancies:
– Temporal: MC-prediction (P and B frames)
– Spatial: Block DCT
– Color: Color space conversion
• Scalar quantization of DCT coefficients
• Zigzag scanning, runlength and Huffman coding of the
nonzero quantized DCT coefficients
33. Example Video Encoder
[Block diagram: Input video → RGB to YUV → subtract MC-prediction → DCT → Quantize → Huffman Coding → Buffer → Output bitstream; buffer fullness feeds back to control the quantizer. A decoder loop inside the encoder (Inverse Quantize → Inverse DCT → add MC-prediction → Frame Store) reconstructs the previous frame; Motion Estimation against it produces the MV data, which is also placed in the bitstream.]
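The key structural point of this diagram is the decoder-in-the-encoder loop: the prediction is formed from the *reconstructed* previous frame, exactly as the decoder will see it, so quantization error cannot accumulate as drift. A scalar DPCM toy (standing in for the full MC/DCT pipeline) illustrates the loop:

```python
def encode(samples, step=8):
    """Closed-loop predictive encoder sketch: predict each sample from the
    reconstructed previous value (the 'frame store'), quantize the
    residual, and update the reconstruction the same way the decoder will."""
    recon = 0
    stream = []
    for x in samples:
        residual = x - recon           # prediction (MC-prediction stand-in)
        q = round(residual / step)     # quantize the residual
        stream.append(q)
        recon += q * step              # inverse quantize + add = frame store
    return stream

def decode(stream, step=8):
    """Mirror of the encoder's internal reconstruction loop."""
    recon, out = 0, []
    for q in stream:
        recon += q * step
        out.append(recon)
    return out

reconstructed = decode(encode([100, 104, 120]))
```

Encoder and decoder reconstructions stay bit-identical, so the error on every sample is bounded by the quantizer step and never grows over time.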
34. Example Video Decoder
[Block diagram: Input bitstream → Buffer → Huffman Decoder → Inverse Quantize → Inverse DCT → add MC-prediction → YUV to RGB → Output video; Motion Compensation uses the MV data and the previous reconstructed frame from the Frame Store.]
35. Outline of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
36. Motivation for Scalable Coding
Basic situation:
1. Diverse receivers may request the same video
– Different bandwidths, spatial resolutions, frame rates,
computational capabilities
2. Heterogeneous networks and a priori unknown network conditions
– Wired and wireless links, time-varying bandwidths
→ When you originally code the video you don’t know which client
or network situation will exist in the future
→ Probably have multiple different situations, each requiring a
different compressed bitstream
→ Need a different compressed video matched to each situation
• Possible solutions:
1. Compress & store MANY different versions of the same video
2. Real-time transcoding (e.g. decode/re-encode)
3. Scalable coding
37. Scalable Video Coding
• Scalable coding:
– Decompose video into multiple layers of prioritized
importance
– Code layers into base and enhancement bitstreams
– Progressively combine one or more bitstreams to produce
different levels of video quality
• Example of scalable coding with base and two enhancement
layers: Can produce three different qualities
1. Base layer
2. Base + Enh1 layers
3. Base + Enh1 + Enh2 layers (progressively higher quality)
• Scalability with respect to: Spatial or temporal resolution, bit
rate, computation, memory
38. Example of Scalable Coding
• Encode image/video into three layers: Base, Enh1, Enh2
• Low-bandwidth receiver: Send only the Base layer → decode low-resolution video
• Medium-bandwidth receiver: Send Base & Enh1 layers → decode medium-resolution video
• High-bandwidth receiver: Send all three layers → decode high-resolution video
• Can adapt to different clients and network situations
39. Scalable Video Coding (cont.)
• Three basic types of scalability (refine video quality
along three different dimensions):
– Temporal scalability → Temporal resolution
– Spatial scalability → Spatial resolution
– SNR (quality) scalability → Amplitude resolution
• Each type of scalable coding provides scalability of one
dimension of the video signal
– Can combine multiple types of scalability to provide
scalability along multiple dimensions
40. Scalable Coding: Temporal Scalability
• Temporal scalability: Based on the use of B-frames to
refine the temporal resolution
– B-frames are dependent on other frames
– However, no other frame depends on a B-frame
– Each B-frame may be discarded without affecting
other frames
I0 B1 B2 P3 B4 B5 P6 B7 B8 I9
MPEG GOP
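Discarding B-frames is trivially expressible; a sketch using the GOP labels above:

```python
def drop_b_frames(frames):
    """Temporal scalability: B-frames may be discarded, since no other
    frame depends on them; what remains is still a decodable sequence
    at a lower frame rate."""
    return [f for f in frames if not f.startswith("B")]

gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "I9"]
reduced = drop_b_frames(gop)
```

Dropping an I- or P-frame, by contrast, would break every frame that is predicted from it.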
41. Scalable Coding: Spatial Scalability
• Spatial scalability: Based on refining the spatial resolution
– Base layer is low resolution version of video
– Enh1 contains coded difference between upsampled
base layer and original video
– Also called: Pyramid coding
[Block diagram: the original video is downsampled (↓2) and coded as the base layer, yielding low-res video; the decoded base layer is upsampled (↑2) and subtracted from the original, and the difference is coded as the enhancement layer, which refines the output to high-res video.]
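A 1-D pyramid-coding sketch (nearest-neighbor up/downsampling is used for illustration; real coders use proper filters):

```python
def downsample(x):
    return x[::2]

def upsample(x):
    out = []
    for v in x:
        out += [v, v]   # nearest-neighbor upsampling, for illustration
    return out

def pyramid_layers(signal):
    """Spatial scalability (pyramid coding) on a 1-D signal: the base layer
    is the low-resolution version; the enhancement layer is the difference
    between the original and the upsampled base."""
    base = downsample(signal)
    enh = [s - u for s, u in zip(signal, upsample(base))]
    return base, enh

signal = [10, 12, 20, 22]
base, enh = pyramid_layers(signal)
# A high-bandwidth receiver adds the enhancement back to the upsampled base:
recon = [u + e for u, e in zip(upsample(base), enh)]
```

A low-bandwidth receiver simply decodes `base`; adding `enh` recovers the full-resolution signal.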
42. Scalable Coding: SNR (Quality) Scalability
• SNR (Quality) Scalability: Based on refining the
amplitude resolution
– Base layer uses a coarse quantizer
– Enh1 applies a finer quantizer to the difference
between the original DCT coefficients and the
coarsely quantized base layer coefficients
[Figure: base layer (I- and P-frames) refined by an enhancement layer (EI- and EP-frames); note: base and enhancement layers are at the same spatial resolution.]
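A scalar sketch of SNR scalability on a single coefficient (the step sizes are illustrative):

```python
def quantize(x, step):
    """Uniform scalar quantizer: round to the nearest multiple of `step`."""
    return round(x / step) * step

def snr_layers(coeff, base_step=16, enh_step=4):
    """SNR scalability on one DCT coefficient: the base layer quantizes
    coarsely; the enhancement layer re-quantizes the leftover error with
    a finer step, refining the amplitude resolution."""
    base = quantize(coeff, base_step)
    refinement = quantize(coeff - base, enh_step)
    return base, base + refinement

base, refined = snr_layers(37)
```

Decoding only the base layer gives a coarse approximation; adding the enhancement layer shrinks the reconstruction error by the ratio of the two step sizes.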
43. Summary of Scalable Video Coding
• Three basic types of scalable video coding:
– Temporal scalability
– Spatial scalability
– SNR (quality) scalability
• Scalable coding produces different layers with prioritized
importance
• Prioritized importance is key for a variety of applications:
– Adapting to different bandwidths, or client resources
such as spatial or temporal resolution or computational
power
– Facilitates error-resilience by explicitly identifying most
important and less important bits
44. Outline of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
45. Motivation for Standards
• Goal of standards:
– Ensuring interoperability: Enabling communication
between devices made by different manufacturers
– Promoting a technology or industry
– Reducing costs
46. What do the Standards Specify?
Encoder → Bitstream → Decoder
47. What do the Standards Specify?
Encoder → Bitstream → Decoder (Decoding Process); the scope of standardization covers only the bitstream and the decoding process
• Not the encoder
• Not the decoder
• Just the bitstream syntax and the decoding process (e.g. use IDCT,
but not how to implement the IDCT)
→ Enables improved encoding & decoding strategies to be
employed in a standard-compatible manner
48. Current Image and Video Compression Standards
Standard            Application                                      Bit Rate
JPEG                Continuous-tone still-image compression          Variable
H.261               Video telephony and teleconferencing over ISDN   p x 64 kb/s
MPEG-1              Video on digital storage media (CD-ROM)          1.5 Mb/s
MPEG-2              Digital Television                               2-20 Mb/s
H.263               Video telephony over PSTN                        33.6-? kb/s
MPEG-4              Object-based coding, synthetic content,          Variable
                    interactivity
JPEG-2000           Improved still image compression                 Variable
H.264 / MPEG-4 AVC  Improved video compression                       10’s to 100’s kb/s
49. Comparing Current Video Compression Standards
• Based on the same fundamental building blocks
– Motion-compensated prediction (I, P, and B frames)
– 2-D Discrete Cosine Transform (DCT)
– Color space conversion
– Scalar quantization, runlengths, Huffman coding
• Additional tools added for different applications:
– Progressive or interlaced video
– Improved compression, error resilience, scalability, etc.
• MPEG-1/2/4, H.261/3/4: Frame-based coding
• MPEG-4: Object-based coding and Synthetic video
50. MPEG Group of Pictures (GOP) Structure
• Composed of I, P, and B frames
• Arrows show prediction dependencies
• Periodic I-frames enable random access into the coded bitstream
• Parameters: (1) Spacing between I frames, (2) number of B frames
between I and P frames
I0 B1 B2 P3 B4 B5 P6 B7 B8 I9
MPEG GOP
51. MPEG Structure
• MPEG codes video in a hierarchy of layers. The
sequence layer is not shown.
[Figure: hierarchy of layers: GOP layer (I-, P-, and B-pictures) → Picture layer → Slice layer → Macroblock layer (1 MV, four 8x8 blocks) → Block layer (8x8 DCT).]
52. MPEG-2 Profiles and Levels
• Goal: To enable more efficient implementations for
different applications (interoperability points)
– Profile: Subset of the tools applicable for a family of
applications
– Level: Bounds on the complexity for any profile
[Figure: profile (Simple, Main, High) vs. level (Low, Main, High) grid; HDTV uses Main Profile at High Level (MP@HL); DVD & SD digital TV use Main Profile at Main Level (MP@ML).]
53. MPEG-4 Natural Video Coding
• Extension of MPEG-1/2-type algorithms to code
arbitrarily shaped objects
[Figure: frame-based coding vs. object-based coding; MPEG Committee]
Basic Idea: Extend Block-DCT and Block-ME/MC-prediction
to code arbitrarily shaped objects
54. Example of MPEG-4 Scene (Object-based Coding)
[Figure: example MPEG-4 scene composed of audio-visual objects; MPEG Committee]
55. Example MPEG-4 Object Decoding Process
[Figure: MPEG-4 object decoding process; MPEG Committee]
56. Sprite Coding (Background Prediction)
• Sprite: Large background image
– Hypothesis: Same background exists for many frames,
changes resulting from camera motion and occlusions
• One possible coding strategy:
1. Code & transmit entire sprite once
2. Only transmit camera motion parameters for each
subsequent frame
• Significant coding gain for some scenes
57. Sprite Coding Example
[Figure: sprite (background), foreground object, and reconstructed frame; MPEG Committee]
58. Review of Today’s Lecture
• Motivation for compression
• Brief review of generic compression system (from prior lecture)
• Brief review of image compression (from last lecture)
• Video compression
– Exploit temporal dimension of video signal
– Motion-compensated prediction
– Generic (MPEG-type) video coder architecture
– Scalable video coding
• Overview of current video compression standards
– What do the standards specify?
– Frame-based video coding: MPEG-1/2/4, H.261/3/4
– Object-based video coding: MPEG-4
59. References and Further Reading
General Video Compression References:
• J.G. Apostolopoulos and S.J. Wee, "Video Compression Standards", Wiley Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, Inc., New York, 1999.
• V. Bhaskaran and K. Konstantinides, Image and Video Compression
Standards: Algorithms and Architectures, Boston, Massachusetts:
Kluwer Academic Publishers, 1997.
• J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall, MPEG
Video Compression Standard, New York: Chapman & Hall, 1997.
• B.G. Haskell, A. Puri, A.N. Netravali, Digital Video: An Introduction to
MPEG-2, Kluwer Academic Publishers, Boston, 1997.
MPEG web site:
http://drogo.cselt.stet.it/mpeg